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Abstract 

K^ ' We address the problem of likelihood based inference for correlated diffusion pro- 

^n : 

Q^ , cesses using Markov chain Monte Carlo (MCMC) techniques. Such a task presents two 



in 



interesting problems. First, the construction of the MCMC scheme should ensure that 
the correlation coefhcients are updated subject to the positive definite constraints of the 



J~^ I diffusion matrix. Second, a diffusion may only be observed at a finite set of points and 

the marginal likelihood for the parameters based on these observations is generally not 

• rH . available. We overcome the first issue by using the Cholesky factorisation on the diffusion 

^ • matrix. To deal with the likelihood unavailability, we generalise the data augmentation 

framework of Roberts and Stramer (2001 Biometrika 88(3):603-621) to d— dimensional 
correlated diffusions including multivariate stochastic volatility models. Our method- 
ology is illustrated through simulation based experiments and with daily EUR /USD, 
GBP/USD rates together with their implied volatilities. 
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1 Introduction 

Diffusion processes provide a natural model for phenomena evolving continuously in time. 
One of their appealing features is that they are defined in terms of the instantaneous mean 
and variance of the process. Specifically, a diffusion xt obeys the dynamics of the following 
stochastic differential equation (SDE) 



dxt = fi{t, Xt, 0)dt + o"(t, Xt, 9)dwt, 



(1) 



driven by standard Brownian motion wt- The functions fi{.) and a{.) are termed as the 
drift and the volatility of the diffusion respectively. Throughout this paper we suppress 
the dependence on t to simplify the notation, but the methodology is also applicable to 
time inhomogeneous diffusions. The diffusion process xt is well defined if ([1]) has a unique 



weak solution, which translates into some regularity 



conditions (locally 



ipschi tz with a 



linear growth bound) on ^(.) and (t(.); see chapter 5 of lRogers and Williamsl (| 19941 ) for more 
details. 



We address the problem of modelling several diffusions, denoted by x 



{i} 

t ' 



l,...,d. 



Each diffusion Xj may have a drift fJ-'-^'i-) and volatility o"'^*J (.) of general, yet known, form. 
We also allow for correlations, corr^dxi" ,dx\ )=pij = pji, i ^ j, on the instantaneous 
increments. The use of cross-correlations is quite common when modelling multivariate time 
series, as they may capture effects caused by common factors of the underlying stochastic 
processes. In this paper we illustrate our methodology through two examples of correlated 
diffusions. The first example targets interest rates and bond pricing. Such time series often 
exhibit strong inter-dependencies; for instance, interest rates may correspond to similar bonds 
but with different expiry dates, thus giving rise to cor relations among them. In Section [5] we 



examine a multivariate version of the 



Cox et al. 



(|l985l ) model (CIR), often used for such data. 



The second example considers currency pairs which are known to be correlated, possibly due 
to the common currencies they may represent. Section [6] contains an analysis on EUR/USD 
and GBP/USD data, based o n multivariate versions of stochastic volatility diffusions, such 



as the model of lHestonI (Il993l ). In both examples, the inclusion of correlations in the model 
is essential for two reasons. First, they may affect the parameter estimates of the individual 
diffusions, as well as their precision. Second, they reflect characteristics of the market which 



may be useful in the bond/option pricing procedure. 

We proceed by combining the diffusions x] together into X, 



i4'\ 



M} 



y (with ' 



denoting transposition), so that Xt is a d— dimensional vector for each time t. The diffusion 
matrix of Xt, A, denotes its instantaneous covariance and takes the following form: 



A:-- 






Pidcr^ 



{i}r wWr^ \ 



a- 



P2dCTi2}(.)cTW( 



rWr^2 



(2) 



ctW(.)^ J 



The diffusion process Xt is defined through the following multi-dimensional SDE 



dXt = M{Xt, e)dt + s(Xt, e)dWt, 



(3) 



where Wt is a d— dimensional Brownian motion with independent components, with vector 
valued drift M : [0, +oo) x 5x x 9 ^ SR'' with [M(.)]j = /u''^*^(.), and matrix valued volatility 
(also termed as dispersion matrix) S(-) : [0, -|-cxo) x Sx x G — > Ik'^^'^, where Sx and denotes 
the domain of the diffusion Xt and the parameter vector 9 respectively. The dispersion matrix 
S is a square root of the instantaneous covariance matrix A = SS'. To ensure a unique weak 
solution for Xt, we require a unique weak solution for each x^ and the matrix A to be 
positive definite for all t, Xt, 9. 

Each diffusion xl may be observed, with or without error, at a finite set of points, 
or may be entirely unobserved. The diffusion will be termed as directly observed in cases 
with exact observations on all xl , and partially observed otherwise. For ease of exposition, 
the methodology of this paper is initially presented for directly observed diffusions, and 
adaptations to partial observation regimes, as in multivariate stochastic volatility models, are 
provided when necessary. Similarly, we consider observations of the entire vector of Xt at each 
time, although this assumption can easily be relaxed. We denote the times of observations by 



tfc, k = 1, . . . ,n, and the data with Y = lY). = Xt 



(x 



{1} 



Ad}. 



k = l 



...,nj. 



Our 



aim is to draw likelihood based inference for the parameter vector 9 given these observations. 
The task of inference on diffusions observed discretely in ti me is generally no t trivial and 



has received a remarkable attention in the recent literature; see lSorensenI (120041 ) for a recent 
review. The main problem is that the likelihood is generally not available except for a few 
cases. This has stimulated various techniques based on likelihood approximations. Appr oxi- 



mations may be analytical (lAi't-Sahalia 



a refinement of this technique 



20051). or si i nulati on based; see iPedersenI (j 19951 ) or 



Durham and GallantI (|2002l ). They usually approximate the 



likelihood in a way so that the discretisation error can become arbitrarily small, although the 
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methodology developed in 



Beskos et al. 



(J2006al ) succeeds exact inference in the sense that it 



allows only for Monte Carlo error. 

We shall adopt a Bayesian approach using Markov chain Monte Carlo (MCMC ) method. 



Since diffusions are not completely observed, it is natural to use data augmentation (JTanner and Wong 



19871 ). treating the segments of diffusion sample path (or a suitably fine app roximation t o 



this) as missing da ta. Initial MCMC s chemes of this type were introduced by 



ment of 



Eraker (2001) and 



Elerian et al. 



Elerian et al 



2001 



jond tea), 



(|200ll ). However, as noted in the simulation based exper i- 



), and established theoretically bv [Roberts and Stramerj (|200ll ). 



the algorithms introduced in these initial implementations of MCMC in this context degener- 



ate as the number of imputed point s increases. The prob 



sions with the repar ametrisation of 



sation is provided by 



Golightly and Wilkinsonl ( 



e m ma y be overcome for scalar diffu- 



Roberts and Stramerl (l200lh . An alternative repara r netri 



2003), see also 



Golightlv and Wilkinsonl (|200fil ) 



for a sequential approach, which can in principle be applied in principle to any diffusion. 

However, the adaptation of such MCMC scheme to multivariate diffusions introduces 
additional issues. The task of updating the covariance matrix A is generally not trivial, as 
its full conditional posterior is most of the times intractable, and the use of Metropolis steps 
is inevitable. It is therefore crucial, especially for high-dimensional diffusions, to update the 
covariance matrix componentwise as the discrepancy between proposed and current moves is 
increasing in d. This introduces the problem of preserving the positive definite structure of 
the diffusion matrix A. Note that drawing samples from the posterior of covariance matri- 
ces, which may not necessarily be diffusion matrices, is a ge neral MCMC issue and us ually 
requires appropriate ma trix decompositions; see for example 



Pinheiro and Bated (|l996l ) and 



Daniels and KassI (|l999l ). 



The contribution of this paper is two-fold. First, we introduce a natural and general 
framework for sampling diffusion matrices in a MCMC environment. This framework is based 
on the Cholesky factorisation of A and enables us to define S explicitly. The MCMC algo- 
rithm may then be appropriately designed to provide samples from the posterior of S, which 
can be transformed to A at any time through the Cholesky decomposition. This framework 



may be coupled with any of t h e prev ious 



such as those of 



Beskos et al. 



(|2006al ) or 



y mentioned likeli hood approximation techniques. 



Ai't-Sahalial (J2005l ). to perform Bayesian inference 



for the parameters of the multi-dimensional diffusion. Second, we offer a full and stand 
alone MCMC scheme which cor nbines the Cholesky d e comp osition with the reparametrised 



data augmentation approach of 



Roberts and Strameij (J200ll ). This scheme may be used for 



parameter estimation of several multivariate diffusion models including stochastic volatility. 
The use of data augmentation is justified by its convenient property to be applicable at both 
directly and partially observed diffusions. 

The paper is organised as follows: Section [2] describes the structure of a data augmen- 
tation scheme and highlights potential problems regarding the irreducibility of the MCMC 
algorithm. These problems may be tackled with the reparametrisation of this paper which 
requires the Cholesky factorisation of the diffusion matrix, presented in Section [3j Specific 
MCMC implementation details are given in Section U] and the methodology of this paper is 
illustrated through simulated data in Section [H and on daily EUR/USD, GBP/USD cur- 
rency pairs in Section [6l Finally, we summarise in Section [7] adding some discussion and 
links to some other relevant work. 

2 Data augmentation and degeneracy issues 

2.1 The problem in practice 

Data augmentation scheme bypasses the problem of simulating directly from the posterior 
'7t{9\Y), which is typically unavailable for discretely observed data. The idea is to introduce 
a latent variable X that simplifies the likelihood C{Y; X, 6). We use the following two steps: 

1. Simulate X conditional on Y and 9. 

2. Simulate 9 from the augmented conditional posterior which is proportional 
to 

C{Y;X,9)7r{9). 

Our problem can easily be adapted to this setting. Y represents the observations of the 
price process Xt, and X contains discrete skeletons of the diffusion paths between Y. Thus, 
X and Y constitute the augmented dataset Xis, i = 0, . . . ,T/6, which is a fine partition 
of the multivariate diffusion Xt with 5 controlling the amount of augmentation. Based on 
this partition the likelihood can be approximated, for example via the Euler-Maruyama 

approximation 

T/5 

C''{Y;X,9) =l[p{X,s\X^,_i)s), 
^.5|^{i-i)5 ~ AA(X(,„i)5 + 5M(X(,„i)5,0),M(X(i_i)5,e)) , (4) 



which is known to converge to the true likelihood C{Y;X,9) for small 5 (jPedersen 

5 



Another property of diffusions relates A{Xt,6) with the quadratic variation process. 
Specificany it is well-known that 

T/5 

m 

<5- 



1/0 J' 

lim V] [Xis - Xu-i)&) [Xis - Xu_i)s)' = / A{Xs, 9)ds a.s. 

i=l ''^ 



(5) 



The solution of the equation above determines the diffusion matrix parameters exactly. 
Hence, there exists perfect correlation between these parameters and A' as (5 — > 0. Thus 
for the theoretical algorithm which imputes the entire X path, the MCMC algorithm is 
reducible. In practice this means that as the proportion of imputed data points increases 



mixing probl ems for the MCMC chain b ecom e progressive 



first noted in 



Roberts and Strameii (J200l|) and 



Elerian et al. 



worse This phenomenon was 



(J200lh . As would be expected, 



the EM algorithm suffers from the same problem. 

2.2 Measure theoretic probability viewpoint 

In this section, we explore the problem from a different angle, through a slightly more 
rigorous look at the likelihood. Let Xt be a diffusion that satisfies ([3]) and assume Xq = Yq 
and Xi = Yi, Y = {Yi,Y2). Denote the probability law of X by Pg and that of its driftless 
version, 

dMt = a{Xt,e)dWu 

by Qe. To write down the likelihood, we can use the Cameron-Martin-Girsanov formula 
which provides the Radon-Nikodym derivative of P51 with respect to Q51: 



dFa 



G{X,M,A) =exp|/" [A{Xs,e)-^M{Xs,e)\'dXs 



M{Xs,eyA{Xs,e)-^M{Xs,6)ds 



Note that the expression above contains stochastic and path integrals for which an analytic 
solution is generally not available. However, given a sufficiently fine partition of the diffusion 
path, they can be evaluated numerically providing an approximation of the likelihood which 
is equivalent to (JH). 

Now assume for a moment that under Qg the marginal density of Y with respect to 
d— dimensional Lebesgue measure Lebd{Y), is known and denote by fMiX]^)- The domi- 

6 



nating measure Qg can be factorised in the following way 

Qe = Ql xLeba{Y)xf m{Y; 9), (6) 

where QY is the measure Qg conditioned on the observations Y. We can now write 

-(X--,y) = G{X,M,A) X fM{Y;e). (7) 



QJ X LebdiY) 

The expression in ([7]) provides the likelihood for the latent diffusion paths X*"** and the 
parameters 9. However, this likelihood is not valid because its reference measure, Q^, depends 
on parameters. Furthermore, since the volatility parameters are identified by the quadratic 
covariation process, the measure Qe is just a point mass. Consequently, the measures Qe 
are mutually singular and therefore so are ¥g. Hence, inference for both X"^^'^,9 is not 
possible using a common cr— finite dominating measure. In the next section, we specify 
an appropriate transformation of the diffusion that allows a likelihood specification with 
respect to a parameter-fre e dominating measure . This transformation may be viewed as a 



generalisation of the one in 



Roberts and Strameij (|200ll ). The transformed diffusion has unit 



volatility, thus the problems induced by the quadratic variation property of ([5]) are implicitly 
addressed. 

3 Likelihood specification 

3.1 A Cholesky factorisation of the diffusion matrix 

Consider the multi-dimensional SDE of ^ with the diffusion matrix yl of ([2|). The d x d 
matrices A and S are linked through A = SS', therefore S is not unique. However, it 
is crucial to define S explicitly and establish a 1-1 mapping with A, as each one of these 
two matrices may be more convenient for different reasons. The likelihood, defined either 
through the Euler-Maruyama approximation in ^ or through Cameron-Martin-Girsanov's 
formula in ([7]), is expressed in terms of A, which is also the main target of inference. On the 
other hand A is a positive definite matrix, whereas the only assumption made on S requires 
its full rank. Hence it is generally more convenient to work with T, in the context of a 
MCMC algorithm. Moreove r, as mentioned in the previous section, the generalisation of the 



Roberts and Stramei 



(|200ll ) reparametrisation involves a transformation to unit volatility 



which will naturally be based on S. 
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In this paper, we define T, using the Cholesky decomposition of A. Let Sx{Xt,0) = 
diag{a'^'^' [Xt, 9)}. The diffusion matrix may then be factorised in the following way 

A{Xt,e) = Sx{Xt,9)RSx{Xt,e), 

where R is the correlation matrix. One may define S as the product of Sx with the Cholesky 
decomposition of R, say C. But the elements of C will not have the general Cholesky structure, 
since R has the additional property of being a correlation matrix. To eliminate such problems 
we write each ai{Xt,0) as 

a^'HXt,e) = c^f^'HXt,9), Vf, (8) 

for some positive constants Ci. This imposes no restrictions as we can always set /^"^'{Xt, 9) = 
a^^' {Xt,9)/ci^ see Section lSTil for such an example. Now, based on Fx{Xt, 9) = diag{f^^'{Xt,9)}, 
we can use ^ to obtain an alternative decomposition of A, 

A{Xt,9) = Fx{Xt,9)VFx{Xt,9), 

where F is a general symmetric positive definite matrix with 

I Pij CiCj , 1 T^ J- 

The Cholesky decomposition of V, denoted hy C {V = CC), may now be used. The 
dispersion matrix T,{Xt,9) is defined as 

^{Xt,9) = Fx{Xt,9)C. (10) 

In coordinate form, T, may be written as 

( 0, j> i. 

The only restriction on the constants Cij requires compatibility with the Cholesky decom- 
position, which translates on positive diagonal entries Ca. As we mention in 14.21 this is 
particularly convenient in a MCMC environment and specifically for componentwise updates 
of Y^{Xt,9) parameters. The Cholesky decomposition establishes the 1-1 mapping between 
S and A and ensures that the entire space of diffusion matrices as A is covered. 



3.2 Transformation to unit volatility 



In Section [21 the ne ed for a reparametrisation w as highlighted in order to avoid degenerate 



MCMC algorithms. 



Roberts and Stramer (2001 



provide a solution to the problem for scalar 



diffusions, which involves a transformation to unit volatility. H owever, in more t han one 



Ait-Sahahal (|2005|). When 



dimensions such a transformation does not always exist, as noted 

su ch a transfo r matio n is available the diffusion is said to be reducible, a term introduced 

by lAi't-Sahalial (J2005l ) who also provides a necessary and sufficient condition for reducibility: 



diffusions with non-singular S(Xi, 9) are reducible if and only if 



d[^iXt 



d[j:{Xt,er'[ 



ik 



dx] 



{k} 



dx 



{j} 



V i, j, /c G {1, . . . ,d}, with j < k 



(11) 



Not all SDEs with diffusion matrix A as in ([2]) or dispersion matrix S as in (jlOp are 
reducible. In this section, we restrict our attention to diffusions with 

cT«(Xi,^)^a«(x«,e), (12) 

for which we prove the reducibility. This is established by the following proposition: 

Proposition 3.1 Let X be a d-dimensional diffusion which obeys the following SDE: 

dXt = M{t, Xt, e)dt + S(t, Xt, 9)dWt. 

Furthermore, assume that 

T.{Xt,e) = F,{Xt,e)c, 

where Fx{Xt, 9) = diag{f^^>{xY, 0)} and C is a lower triangular matrix with positive diago- 
nal elements. The diffusion X can then be transformed to one with identity diffusion matrix. 
In other words X is reducible. 

Proof: See Appendix. 

The next proposition provides explicitly a transformation to unit volatility. It may be viewed 
as an alternative proof of proposition 13.11 

Proposition 3.2 Consider the setting and the diffusion Xt of proposition [Ql Suppose that 
there exist g^"^' {xt{i} , 0) for i = 1, . . . , d with continuous second derivatives, so that 

dg^^{xf^,e) 1 



dxf^ /W(xP,^)' 



l,...,d. 



and let Gx{Xt,6) = ig^^'x] ,9), . . . ,g^<[xl ,0)] . Consider the transformation 

H{Xt,e) = [h^'^Xt,e),...,h^''Hxt,9)y = c~'Gx{Xt,e). (is) 

The diffusion Ut = H{Xt,9) has then unit volatility. 
Proof: See Appendix. 



The transformation of ()13p may be used to specify the hkehhood under an appropriate 
reparametrisation which wiU ensure a non - decreasing efficiency, of the data augmentation 
MCMC scheme, in the level of augmentation. Notice that the transformation of (J13p to unit 
volatility is not unique. This is not necessary for our methodology, in fact we only require its 
invertibility which is ensured as long as each gi{xt{i},9) is itself invertible. We present this 
reparametrisation in the Section [3.31 whereas in 13.41 we show how to relax the assumption of 
(I12p to handle multivariate stochastic volatility models. 



3.3 Reparametrised likelihood 

Consider the diffusion that satisfies the SDE of ([3]) where the drift M(.) and S satisfy the 
appropriate conditions so that Xt has a unique weak solution and Ito's lemma can be applied. 
Furthermore, assume that 

j:{Xt,9) = Fx{Xt,9)C, 

where Fx{Xt,9) = diag{f^''j{x^ ,9)} and C is a lower triangular matrix with positive di- 
agonal elements. For ease of illustration let the entire vector of Xt be observed at each 
time and denote the times of observations by t^, k = 0, . . . ,n, and the data with Y = 
< Yfc = Xfj. = [x] , . . . ,x] )', k = 1, . . . , n >. We will define the likelihood for a pair of suc- 
cessive observations, (Yfc_i, Yfc). Due to the Markov property of diffusions, the full likelihood 
is just given by the product of all pairs of consecutive observations. Without applying a 
reparametrisation, the likelihood can be defined through d?]). However, as discussed in [21 
this likelihood is problematic because it is written with respect to a dominating measure 
that depends on parameters. The aim of the reparametrisation is to obtain a likelihood with 
a parameter-free dominating measure. 

The first step of the reparametrisation requires a transformation Ut = H{Xt,9) = 
(n"j , . . . ,iii j) , so that the diffusion matrix of Ut is the d— dimensional identity matrix. 
As established by proposition 13. 1[ such a transformation does exist and can be obtained 
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explicitly by (J13p . The SDE of the r— th coordinate of the transformed diffusion U will be 
given by: 

(iuf ^ = fiPiUt, 9)dt + dwl''\ r = l,...,d, 

with 

where Xt may replaced with H~^{Ut,9) so that the SDE is expressed in terms of Ut- If 
we use the Cameron-Martin-Girsanov formula in a similar manner as in Section 12.21 we can 
write the likelihood as 

or equivalently 



([/™-,y) = G{U,^lu,Id)x^f{Yi'-Y,l„Id)\J{Y, 



where W^ is just Wiener measure conditioned on the transformed observations Y^=H(Y, 6), 
AA( y, V) denotes the Gaussian density of Y under mean and covariance V, and J{Y, 6) is the 
Jacobian term from the transformation H(Y,9). The dominating measure of the likelihood, 
W , reflects the distribution of d independent Brownian bridges with Y as endpoints and 
therefore depends on parameters. For this reason we introduce a second transformation 

z^'i{s) = u^'^is) , tfc_i <s<tk, (14) 

tk — tk-1 

for alH G {1, . . . , d}, which centers the bridge to start and finish at and preserves the unit 
volatility. Let Z = (2:''"'^-', . . . , z'-'^'^ and the function U = ri{Z) to be the inverse of [TH The 
SDE for Z becomes 

dzl'^ = njj^{7]{Zt),9)dt + dwi'\ V i G {1, . . . , d} 

The likelihood may now be written as 

-Tip 

-^^^—^-—{Z^^^,h{Y,9)) =G{v{Zt),Mu,I,)xAf{Yi'-Yii„I,)\J{Y,9)\, (15) 
where 



Mu=[fih7MZt),9),...,f,}j"^^iviZt),9)^ 

The dominating measure of the likelihood provided by [15] does not depend on any parame- 
ters, being the product of d independent Brownian bridges that start and finish at 0. The 



11 



likelihood of (jlSp may be used to construct an irreducible MCMC scheme which will not 
degenerate as we increase the amount of augmentation. The stochastic and path integrals 
involved cannot be solved analytically but they can be evaluated numerically given a suffi- 
ciently fine partition of the diffusion path. Note also that, as a result of these transformations, 
inference will now be based on Zt rather than Xt. However, the posterior draws of Zt may 
be inverted to provide samples from the posterior of Xt. 



3.4 Multivariate stochastic volatility models 

In the previous subsection we assumed a diffusion with SDE that satisfies (|12p so that the 
transformation of (J13p is directly applicable. However, there exist interesting diffusion models 
outside of this class with a broad range of applica tions. One famous ex ample of such models is 
provided by stochastic volatility; see for example 



stochastic vo 
and 



atilit y models, including those of 



Ghvsels et a^ 



19961 ) . Most diffusion driven 



Hull and Whitel (119871 ) . IStein and SteinI (|l99lh 



(16) 



HestonI (jl993l ). belong to the following general class of 2— dimensional SDEs 

^xt\ _( lixivuQ) \ ^ / o^ivuQ) \idht 

dvt J \ fJ.v{vt,0) J \ ^ (^v{vt,0) J \ dwt 

where bt and wt are correlated standard Brownian motions, xt usually denotes the log price, 
whose volatility is provided by another diffusion vt- 

Diffusioris that satisfy SDEs as in (J16p cannot generally be transformed to unit volatility 



( Ai't-Sahalia 



to construct an irreducib 
noted in 



20051 ). as the reparametrisation of 13. 31 requires. Nevertheless, it is still possible 
e data augmentation scheme to estimate their parameters. As 



Chibetal 



20051 ) the conditional likelihood of xt, given vt, is available in closed 



form and therefore only th e paths of vt need to b e imputed to approximate the likelihood. 



KalogeropoulosI (120071 ) . it suffices to transform vt itself to unit 



Consequently, as shown in 
volatility. 

This idea may be coupled with the Cholesky factorisation to handle multivariate stochas- 
tic volatility models. We illustrate this for the case of a bivariate Heston model. The scalar 
Heston model can be written as 

dxt = ifJ-x- 2^? ) dt + ^/v'tdbt, 

dvt = n{^v — vt)dt + (Ty/vtdwt- 

where bt and wt are correlated. We can re-write the top equation, by setting c = y/Jhi, to 

dxt = I ^ix 

12 



i»?)* + .^.B,. 



Based on the formulation above, a bivariate Heston model may be written as a 4— dimensional 
diffusion Xt = (v] ,vl ,xl ,x] J , with x] ,xl denoting the log-prices, and v} ,v} 
their volatilities. The diffusion matrix now has the general form of ([2]) all of the components 
of Xt may be correlated. Since ([8]) holds for each component of Xt, we can define the 
dispersion matrix of Xt as in (fTOl) 



/ d.f > \ 



dvl 



{2} 



dx 
dx 



{1} 

t 

{2} 



K'2 ifJ-2 
/^3 



.i^A \ 



,{2} 






dt + F,{Xt,9)CdBt, 



(17) 



V f.,-uvrr J 



where now Bt is a 4— dimensional Brownian motion with independent components. 



Fx{Xt,0) = diag < 



't 1 



;{2} . 



,{1} 



,{2}1 



^1 



Ai2 



and C is the lower triangular Cholesky matrix whose entries Cij may be seen as a 1-1 
transformation of parameter vector containing the correlations pij, and also di, cJ2, y^/^T ^-^^d 

Regarding the likelihood, consider again a pair of successive observations, lA,._i,lfc with 



Yfc = (y^. , y^ ) , for Xj ,xl . Conditional on f ^ , t;^ , and therefore also on their corre^ 



sponding Brownian components b] 
mean 



{1} ^{2} 

, the likelihood for Y^ is a bi-variate Gaussian with 



,,{3} rt^, 

{4} , rtfc 



and covariance matrix 






d^ + ^31 It, 
ds + Qi /,t, 






{2} 

'JJ_ 
y-2 



-dh 



{1} 



+ ^2 it, 



Hi£d^{2} \ 
Ml """ 

M2 * 



/.t.Cls^rf- 



Ms 



/<^*^_, C33C43 



/t,1i C'33C43 



M3M4 



M3/i4 

{2} 



ds 



^« !tS^l, + cl,Y-^ds 



n 



The integrals above cannot be computed analytically, but the augmented path of f ^ ,Vt 
enables accurate numerical approximations of them. 

The remaining part of the likelihood may be obtained through the reparametrisation 
recipe of Section [3.31 modified according to the observation regime of the volatility. In some 
cases the volatility may be entirely unobserved, leading to a partially observed diffusion. Nev- 
ertheless alternative formulations are available, where information from option prices is used 
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to con s truct exact or noisy vo 



(J2005|) , IChernov and Ghvselsl ([20001) and 



atilit y ob servations; see for example lA'it-Sahalia and Kimmel 



Kalogeropoulos et al. 



(J2007l ) . In the presence of ex- 



act observations the transformations of (J13p and (114^ may be used. Note that transformation 
to unit volatility refers to the 2-dimensional diffusion {vj ,vl )', rather than the entire Xt. 
For the bivariate Heston model it takes the following form 

where 



,{1} 

^t 1 ■ 



,{2} 



and D is a block of C containing the Cij entries with i,j = {1, 2}. If the observations are 
noisy or they do not exist at all, the transformation of (jl4|) may be replaced with 

and the M [Yj^ — Yj^^, I^) \J{Y, 6)\ part of the likelihood should be replaced with the relative 
noise density or removed accordingly. 

The above likelihood specification can be applied to all multivariate stoc hastic volatility 



model s that satisfy the SDE ofll61 For m ore complex models, the frarn ework of 



(J2007l ) or time change transformations of 
the Cholesky factorisation. 



Kalogeropoulos et al 



Golightlv and Wilkinson 



(|2007l ) may be combined with 



4 MCMC implementation 

Based on the likelihood specifications of the previous section, it is now possible to construct 
an irreducible data augmentation MCMC scheme. The algorithm may be divided into three 
parts: the updates of the diffusion paths Z"^^^, the parameters of the dispersion matrix 
Ti{Xt, 6) and those of the drift M{Xt, 0). Generally, the updates of the drift parameters may 
be executed using standard random walk Metropolis techniques, although for some diffusion 
models the full conditionals may be analytically tractable and Gibbs steps may be used 
instead. Hence, in the next two subsections we provide some details regarding the updates 
of the diffusion paths and the volatility parameters. 



4.1 Updating the imputed paths 

There exist several options for carrying out this step and most of them are based on an 

independence sampler. For discretely observed diffusions the augmented path may be divided 
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into n X d diffusion bridges connecting the observed points, and each one of them may be 
updated in turn. The full conditional of Z"*** may be written as 

d^jZrms^Y) = G{r,{Zt),Mu,Id) ^^Z''j} oc G{v{Zt),Mu,Id), (18) 



where fx{Y;A) is the density of Y with respect to the Lebesgue measure under Fg. Note 
that this expression will be slightly different for stochastic volatility models. 

The dominating measure of the likelihood W*^, in other words a Brownian bridge, may be 
used as the proposal distribution for the independence sampler. Based on (jlSp . the algorithm 
will then contain the following steps 

• Step 1 : Propose a Brownian bridge from ifc_i to t^ ■ 

• Step 2: Substitute into i-th dimension and form Z^ . 

• Step 3: Accept with probability: 

G{v{Z:),Mu,Id) 



min < 1 



G{rj{Zt),Mu,Id)}' 
• Repeat for all k = 1, . . .n and i = 1, . . . ,d. 

The algorithm above takes advantage of the transformation to unit volatility and splits the 
path into n x d independent, under the dominating measu re, bridges. Alternative prop osals 



are available such as the diffusion bridges introduced in 



sation framework of 



Durham and GallantI (J2002l ) and 



Delvon and Hul (|2007), which can be adapted i n a M CMC setting through the reparametri 



moves of the paths in the spirit of 



Beskos et al. 



Golightlv and Wilkinsonl (120071) . Another option is to propose local 



2006bl ). This approach may be viewed as a 



random walk metropolis in the space of diffusion bridges. Note however that this technique 
requires bridges with unit volatility, and therefore it can only be used for correlated diffusions 
through the reparametrisation framework of this paper. 

Further increase in the acceptance rate may be achieved by choosing a proposal distri- 
bution which is closer to the target Pg, for example a linear diffusion bridge. Suppose that 
we propose from another diffusion bridge distribution, denoted by L'', with drift L. We can 

now write: 

dFg dFe/dW^ G{r^{Zt),Mu,Id) ..„. 



dL' 



dLo/dM° 



G{v{Zt),L, 



■ Based on ()19p . the corresponding algorithm, termed as method B in 

), will consist of the following steps: 
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2001 



Roberts and Stramer 



• Step 1: Propose a Brownian bridge from tk^i to tk- 

• Step 2: Substitute into i-th dimension and form Z^ . 

• Step 3: Accept with probability: 

G{r]{Z;),Mu,Id)G{rj{Zt),L,Id) 



inin < 1 

1 'G{v{Z:),L,I,)G{v{Zt),Mu,Id) 

• Repeat for all k = 1, . . .n and i = 1, . . . ,d. 

However, low acceptance rates may still occur, especially in sparse datasets. In such 
cases, each bridge may be further split into smaller blocks an d updating strategies b ased 
on overlapping o r random sized blocks may be advocated; see 



Chib et al 



KalogeropoulosI ( 



20071'! and 



(|2005l ) for more details. These techniques may also be used in partially observed 
diffusions, for example in stochastic volatility models, where some components of the diffusion 
may be observed with error or not be observed at all. 

4.2 Updating the volatility parameters 

As mentioned earlier, the parameter updates of the diffusion matrix A{Xt,0) are not trivial. 
Their full conditional posterior is generally not available in closed form, and Metropolis 
steps are inevitable. The construction of such steps has to ensure that the covariance matrix 
structure of A{Xt,9) is preserved. At the same time, it is desirable to achieve a reasonably 
high acceptance rate of the proposed moves for a good mixing of the MCMC algorithm. While 
the former may be implemented by using an appropriate distribution for symmetric positive 
definite matrices, such as the Wishart distribution, it is extremely difficult to guarantee the 
latter, especially for high dimensional diffusions. 

The Cholesky factorisation introduced in this paper may be of help in such cases. Specif- 
ically, the step of updating the constants q, and the correlations pij, with i,j G {1, . . . ,d} 
and i < j, may be replaced by componentwise updates of the Cholesky matrix C. In contrast 
with the correlations pij, the restrictions implied by the symmetric and positive definite dif- 
fusion matrix A{Xt,0) may be enforced on the elements of C in a straightforward manner, 
as only the positivity of the diagonal entries is required. 

Hence, the updates of Cj^ 's may be implemented through standard random walk Metropo- 
lis steps. Note that {ci,pij) and Gij are linked through 

S^{Xt, 9) R S^{Xu e) = F.,{Xt, 9) V F^{Xt, 9) = AiX^O), (20) 
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where R is the correlation matrix and V is defined in Q. It is not hard to see that they 
are hnked with an 1-1 mapping which is the solution of the system in (I20p with d{d + l)/2 
equations and unknowns. Hence, the draws from the posterior of C may be transformed 
back at any time, to obtain draws from the posterior of {ci,pij). 

5 Simulation based experiments 

In this section we illustrate and test our data augmentation scheme on a 3— dimensional CIR 
model. In other words, we consider a 3— dimensional diffusion Xt = {xj jx] ,xl )' with 
linear drift for each component Ki{fii — x^ ), the CIR formulation of the volatility, ai\J x\ , 
and correlations between all the components, pij, i = 1,2, 3, j < i. This model may be useful 
for the analysis of interest rates time series, where the cross-correlations may be substantial. 
Notice that our framework allows for more general drift and volatility formulations but the 
main focus of this simulation experiment lies mainly in the correlations p.y . The dispersion 
matrix of the multi-dimensional diffusion Xt may be defined as in (jlOp . with 



F^{Xt,e)=diag^^'-^^^ ^/-^^^ ^/-^^^ 



and C being the lower triangular matrix from the Cholesky decomposition, whose entries 
Cij, substitute the parameters cjj and pij. The likelihood reparametrisation requires a trans- 
formation to unit volatility which is given by 

Ut = H{Xt,C) = C~^aXt), 

with 



G,(Xi) = (2Vxf^2V4'^2V4'^ 



The second transformation is that of (J14p . and the likelihood may be obtained from (jlSp . 
To complete the model formulation we assign non-informative priors: p{9) oc 6~^ for the 
positive parameters Ki,fii,Cii and p{6) oc 1 for the rest {Cij,i > j). 

We simulated 500 equidistant observations (apart from the initial point) at times {t^ = 
k, k = 0. . . ,n} with t„ = 500. Several MCMC runs, with different numbers of imputed 
points ?n={20, 40, 60, 80}, were examined. This was done to monitor the autocorrelation as 
well as the approximation error of the likelihood in relation with the level of augmentation. 
The acceptance rate of the independence sampler used for the path updates was 98.14%, 
raising no concerns regarding its performance. Figure [1] shows autocorrelation plots for the 
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posterior draws of the C matrix components. There is no sign of any increase to raise 
suspicions against the irreducibihty of the chain. Figure [2] depicts density plots for some 
parameters as well as the log-likelihood which may be seen as an appropriate diagnostic plot 
for the quality of the approximations. Densities for m = 60 and tti = 80 look similar and 
therefore the argument that their level of augmentation is sufficient appears to be plausible. 
The plots of Figure [2] and the results of Table [U which contains summaries of the parameter 
posterior draws for m = 80, are in good agreement with the true values of the parameters. 

[Figure 1 about here.] 

[Figure 2 about here.] 

[Table 1 about here.] 

6 Application: EUR/USD and GBP /USD exchange rates 

The dataset consists of roughly two years of daily exchange EUR/USD and GBP/USD rates, 
specifically from the 3rd of January 2005 to 22nd of December 2006. We denote these rates 
with T-ew/usfi and r^^p/'^^'^ and their logarithms with ye*"'/"^'^ and ysbp/^so! respectively. Our 
dataset also contains the corresponding month implied volatilities constructed from options 
made on the currency pairs. The data are plotted in Figure El 

[Figure 3 about here.] 

We use the implied volatilities of the currency pairs to construct proxies for their ac- 
tual volatilities, denoted with /ye^'"/"*'^ and /ys^'P/'"''^. For simplicity, these proxies are 
assumed to be exact obse rvations of the volatilities . Alte rnative assumptions are possible, 



such as their adjustment (jAi't-Sahalia and Kimmel 



20051 ). or a formulation with noisy ob- 



servations. Table [2] provides several descriptive statistics including the correlation matrix of 
the 4— dimensional time series containing the implied volatilities and the log-exchange rates 

T^ / Tjreur/usd j^rgbp/usd Yeur/usd Ygbp/usd'\ 

[Table 2 about here.] 

Note that some correlations appear to be substantial and should be taken into account in 
the analysis of the data. Hence we fit the bivariate Heston model to the 4— dimensional time 
series Y using the MCMC data augmentation scheme of this paper. Section 13.41 provides 
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details on the reparametrised likelihood for the data. For reasons of model parsimony, we 
only consider correlations between the pairs (/ye«r/«sd^ jyg6p/«sd^ ^^^ ^ye«r/nsd^ y9bp/«.d^^ 

and set the remaining ones {p3i,P32,P4i,Pi2) to zero. This is in line with Table [2] and some 
preliminary analysis which considered all possible correlations. Note that the parameters of 
C that need to be updated are just Cn, C21, C22 and C43, as C33 and C44 are redundant 
and the remaining entries are equal to zero like the corresponding correlations. In other 
words, there exists a 1-1 mapping between the diffusion matrix elements {(Ji,cr2,p2i,Pi3) and 
((711,(721,022,^43). We complete the model by assigning non-informative priors as in the 
previous section: p{9) oc 6~^ for the positive parameters {ki,K2,Pi,P2,Cu,C22) and p{9) oc 1 
for the rest {p3,PA,C2i,C43). 

As before, several MCMC runs with different numbers of imputed points m={W, 20, 40} 
were used. The data, referring to business days, were assumed to be equidistant and the time 
was measured in years. Again, the acceptance rate of the independence sampler used for the 
path updates was particularly high 99.16%. The autocorrelation plots of draws from the 
posterior of the parameters Cii,C2i,C22, and 6*43, in Figure [H reveal no sign of any increase 
in the level of augmentation. 

[Figure 4 about here.] 

Regarding the approximation error due to the discretisation of the diffusion path, the density 
plots from the posterior draws of some parameters and the log- likelihood, in Figure[5l provide 
convergence evidence for the approximating sequence of the data augmentation scheme. 

[Figure 5 about here.] 

Table [3] contains summaries of the parameter posterior draws, where both correlations appear 
to be high. Note that the non-parametric estimates of Table [2] are based on the quadratic 
variation process and are therefore amenable to bias due to the discretisation of the diffusion 
path. On the other hand, the discretisation error of the model estimates may become arbi- 
trary small. The posterior mean or median values provide point estimates of the parameters 
which may be used for option pricing purposes. Alternatively, the samples from their poste- 
rior of the parameters may be used in a Bayesian option pricing framework. In any case, it 
may be useful to take into account the correlated market structure of the log-exchange rate 
and their impled volatilities. 

[Table 3 about here.] 
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7 Discussion 

In this paper we introduced a parametrisation framework based on the Cholesky decompo- 
sition, for handling correlations of multi-dimensional diffusions in a Bayesian MCMC set- 
ting. This framework facilitates componentwise updates of the diffusion matrix, in a way 
so that its positive definite structure is preserved. It may therefore be of substantial value 
in high dimensional diffusion models. The Cholesky factorisation was used in connection 
with data augmentation and therefore applies to both directly and partially observed diffu- 
si ons. In order to overco r ne de generate MCMC algorithms, the likelihood reparametrisation 
of [Roberts and Stramerl (120011 ) was generalised to several multi-dimensional diffusions, in- 
cluding stochastic volatility models, thus providing a stand alone solution to the problem. 
Being a data augmentation scheme, our MCMC algorithm is based on an approximation of 
the likelihood, whose error may become arbitrarily small by simply increasing the level of 
augmentation. 

Nonetheless, the Cholesky factorisation of the diffusion matrix may be coupled with 
alternative, to data augme ntation, tech n iques f or approximating the likelihood. The ex- 



act inference framework of 



Beskos et al. 



(J2006al ) and the analytic likelihood expansions of 
Ai't-Sahalial (|2005l ) provide such examples with appealing properties: the former eliminates 
entirely the error due to the discretisation of the diffusion path, whereas the latter provides 
closed form expressions of the likelihood. On the other hand, their generalisation to partially 



observed diffusion may present major difficulties. 

Apart from the updates of the diffusion matrix par ameters, ou r MCM C a 

(12005|)and 



from other data augmentation schemes, such as those of 



Chib et al. 



gorithm differs 



Golightly and Wilkinson 



(J2007l ). in the proposal distribution of the independence sampler involved in the updates of 
the diffusion pa ths. Under these schemes, t he proposal may either b e the multi-dimensiona l 
bridge of the of [ 



Durham and GallantI (|2002l ). or alternatively that of lDelyon and Hul (|2007l ). 
with the target diffusion matrix. Current work investigates the behavior of all existing ap- 
proaches in different settings regarding the dimensionality of the diffusion, the amount of 
correlation, and the sparseness of the data. 
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A Proofs of propositions 

Proof of proposition 13. It 

The proof is based on he reducibihty condition of (jlip . for which we need the inverse of 

j:{Xt,e) 

In coordinate form the above writes 



Hence, it is not hard to see that the reducibility condition of 

d[j:{Xt,9)-% _d[j:{Xt,d)-% 



Ait-Sahahal (J2005l ) holds because 



dx] dxf 



0, V i,j, k £ {1, . . . ,d}, with j < k 



Proof of proposition 13. 2t 

The diffusion matrix of Ut should be a d— dimensional identity matrix, therefore by Ito's 
lemma we get 

VH{Xt,9)A{VH{Xue)y = Id (21) 

Consider a transformation of the form 

H{Xt,9) = BG^{Xt,9), 

where B is an arbitrary dx d matrix, independent of Xt. 
We can write 

VH{Xt,9) = BDG{Xt,9), 

where DG{Xt,9) is a diagonal matrix with 

[DciXt, 9)lu = /«(xf ^ ^)-\ i = l,...,d. 
Indeed, the A;— th row of VH{Xt,9) equals 

VH{Xt,9)=V iY.Bkjg^'Hxt{j},9)\ = {B^i, . . . , B^d) Dg{Xu9). 
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If we substitute on (|2T|) , using also (|T0|) , we get 

B DG{Xt,e) F,{Xt,9) c a F,{Xt,e) DG{Xt,e) B' = la, 

which since DG{Xt,e)F^{Xt, 9) = F^iX^e) DciXt^O) = /^ becomes 

BCC B' = la, 
which is satisfied if we set B = C~^ . 



25 



List of Figures 



Autocorrelation plots for the posterior draws of the C matrix entries for dif- 
ferent numbers of imputed points (m = 20, 40, 60, 80). Simulated data [27 

Kernel densities of the posterior draws for some parameters (/ii, <T2, P32) and 
the log-likelihood, for different numbers of imputed points {m = 20, 40, 60, 80). 

Simulated data 1281 

Daily EUR/USD and GBP/USD rates (up) and their month implied volatili- 
ties (%) (down) from 3rd of January 2005 to 22nd of December 2006 |29| 

Autocorrelation plots for the posterior draws of the C matrix entries for differ- 
ent numbers of imputed points (m = 10,20,40). EUR/USD and GBP/USD 

exchange rates dataset ISO 

Kernel densities of the posterior draws for some parameters (/ii, ;Ui, o"2, P2I) 
P43) and the log-likelihood, for different numbers of imputed points {m = 
10,20,40). EUR/USD and GBP/USD exchange rates dataset [si 



26 



i^ 
















m=20 

m=40 

m=60 
m=80 








_ . . = 






- ^^.:=C^^<^ 




1 
















— I — 

20 



— I — 
60 



Lag - Cii 



Lag - C21 




— I — 

40 



Lag - C; 




— I — 

60 



22 



Lag - C31 











m=20 

- - m=40 

- rn^eO 

■ - ■ m=80 




w__ 








Lag - C32 



Lag - C33 



Figure 1: Autocorrelation plots for the posterior draws of the C matrix entries for different 
numbers of imputed points (m = 20, 40, 60, 80). Simulated data. 
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Figure 2: Kernel densities of the posterior draws for some parameters (/xi, cr2, p^a) and the 
log-hkehhood, for different numbers of imputed points {jn = 20,40,60,80). Simulated data. 
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Figure 4: Autocorrelation plots for the posterior draws of the C matrix entries for different 
numbers of imputed points (m = 10,20,40). EUR/USD and GBP/USD exchange rates 
dataset. 
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Parameter 


True Value 


Posterior mean 


Posterior SD 


Posterior median 


Ki 


0.2 


0.174 


0.025 


0.174 


K2 


0.15 


0.123 


0.031 


0.121 


1^3 


0.22 


0.223 


0.030 


0.224 


^J'l 


2.5 


2.578 


0.167 


2.571 


fJ'2 


3.0 


2.986 


0.366 


2.951 


^J'3 


2.0 


1.908 


0.094 


1.905 


0"! 


0.45 


0.434 


0.016 


0.434 


0-2 


0.35 


0.372 


0.012 


0.372 


0-3 


0.4 


0.401 


0.014 


0.402 


P2I 


0.45 


0.480 


0.034 


0.480 


P31 


0.35 


0.318 


0.041 


0.319 


P32 


0.55 


0.537 


0.033 


0.538 



Table 1: Summaries of the posterior draws of the model parameters for m = 80. Simulated 
dataset. 
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Mean 


St. 


Deviation 


Median 




jyeur/usd ^ 


100 


0.693 




0.076 


0.708 




jygbp/usd ^ 


100 


0.704 




0.078 


0.696 




eur/usd 




1.2499 




0.045 


1.2578 




^gbp/usd 




1.8304 




0.066 


1.8375 




Correlation Matrix 


^jyeur/usd 




1 










A/ys^'p/'"^'^ 




0.5551 




1 






\\^eur/usd 




0.0148 




0.0101 


1 




^Ygbp/usd 




0.0119 




0.0075 


0.8093 


1 



Table 2: Descriptive statistics for EUR/USD and GBP/USD exchange rates and their implied 
volatilities. 
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Parameter 


Posterior mean 


Posterior SD 


Posterior median 


Ki 


0.153 


0.023 


0.153 


K2 


0.206 


0.030 


0.204 


m X 100 


0.677 


0.014 


0.677 


fl2 X 100 


0.689 


0.012 


0.690 


^3 


0.001 


0.053 


0.001 


//4 


0.019 


0.049 


0.019 


0-1 X 100 


0.343 


0.010 


0.343 


(72 X 100 


0.411 


0.013 


0.411 


P21 


0.567 


0.028 


0.567 


P43 


0.821 


0.011 


0.821 



Table 3: Summaries of the posterior draws of the model parameters for m = 60. EUR/USD 
and GBP/USD exchange rates dataset. 
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